For linear models, the target $Y$ (a number or a vector) is approximated by $$f(X) = W \cdot \hat{X},$$ where $\hat{X} = \begin{pmatrix}1\\X\end{pmatrix}$.
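As a concrete illustration, here is a minimal NumPy sketch of this affine map; the names `f`, `W`, and `X` mirror the formula, the weight values are arbitrary, and the first column of $W$ acts as the bias picked up by the constant $1$ in $\hat{X}$.

```python
import numpy as np

# Linear model f(X) = W @ X_hat, where X_hat prepends a constant 1 to X.
def f(W, X):
    X_hat = np.concatenate(([1.0], X))  # X_hat = (1, X)
    return W @ X_hat

W = np.array([[0.5, 2.0, -1.0]])  # 1 output; first column is the bias
X = np.array([3.0, 4.0])
print(f(W, X))                    # [2.5] = 0.5 + 2*3 - 1*4
```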
A natural way to extend the linear model is to compose it with a non-linear function $\sigma$:
$$g(X) = \sigma(W \cdot \hat{X}).$$
Here, $\sigma$ is applied entrywise to the vector $W \cdot \hat{X}$.
This extension gives a single-layer neural network; the main goal of training is to learn the weight matrix $W$.
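A minimal sketch of this single-layer map, assuming the logistic sigmoid for $\sigma$ (the text leaves $\sigma$ unspecified; any entrywise non-linearity would do):

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied entrywise by NumPy broadcasting."""
    return 1.0 / (1.0 + np.exp(-z))

# Single-layer network g(X) = sigma(W @ X_hat).
def g(W, X):
    X_hat = np.concatenate(([1.0], X))
    return sigmoid(W @ X_hat)  # sigma acts on each entry of W @ X_hat

W = np.array([[0.1, -0.3, 0.8],
              [0.0,  0.5, 0.5]])   # 2 outputs, 2 inputs (+ bias column)
print(g(W, np.array([1.0, 2.0])))
```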
A multilayer neural network model with $k$ hidden layers is of the form
$$X \mapsto (g_k \circ \cdots \circ g_2 \circ g_1 \circ g_0)(X), \qquad k \in \{0, 1, 2, \dots\},$$
where each $g_i$ has the same form as the single-layer map above, with its own weight matrix $W_i$.
The weight matrices $W_i$ associated with each layer are learned in the training process.
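The composition reduces to a loop over the weight matrices. The sketch below (reusing the `sigmoid` and layer map from above) uses random, untrained $W_i$ purely to illustrate the shapes involved; in practice these are the matrices fitted during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer(W, X):
    """One map g_i(X) = sigma(W_i @ X_hat)."""
    X_hat = np.concatenate(([1.0], X))
    return sigmoid(W @ X_hat)

def network(weights, X):
    """Apply g_0 first, then g_1, ..., up to g_k."""
    for W in weights:
        X = layer(W, X)
    return X

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)),  # g_0: 2 inputs -> 4 hidden units
           rng.normal(size=(4, 5)),  # g_1: 4 -> 4 hidden units
           rng.normal(size=(1, 5))]  # g_2: 4 -> 1 output (k = 2)
print(network(weights, np.array([0.5, -1.0])))
```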
Sharpening Kernel
$$\begin{pmatrix}0&-1&0\\-1&5&-1\\0&-1&0\end{pmatrix}
=\begin{pmatrix}0&0&0\\0&1&0\\0&0&0\end{pmatrix} + \begin{pmatrix}0&-1&0\\-1&4&-1\\0&-1&0\end{pmatrix}$$
The decomposition shows why this kernel sharpens: the first term is the identity kernel, which keeps the original pixel, and the second term is the negative discrete Laplacian, which adds the difference between a pixel and its four neighbors. The kernel therefore amplifies differences between adjacent pixels, making edges more pronounced.
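A short sketch of applying this kernel to a synthetic grayscale image with `scipy.ndimage.convolve`; the step-edge image is illustrative, and any 2D array of pixel intensities would work the same way.

```python
import numpy as np
from scipy.ndimage import convolve

kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=float)

image = np.zeros((5, 5))
image[:, 3:] = 1.0  # a vertical step edge between dark and bright

sharpened = convolve(image, kernel, mode='reflect')
print(sharpened)  # pixels overshoot on both sides of the edge,
                  # exaggerating the transition
```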